Comments for authors

This paper reveals the connection between password guessing and data compression, providing fresh insights into the security of human-chosen passwords in real-world situations. The authors observe that password guessing has been studied in both cybersecurity and information theory, but with little cross-pollination between the two fields. They therefore formulate the password guessing process using a compression coding model. The model reframes the task of password guessing as the construction of a strategy for determining the order of guesses. Their proposed CompGuess approach, combined with adaptive arithmetic coding, tackles the challenge of synthesizing multiple guessing methods within this compression view of guessing, and sheds light on the security of passwords in real-world scenarios.

Strengths
---------
1. **Problem formulation**. This study offers valuable insights by formulating the password guessing process as a data compression problem, providing an innovative perspective to better understand the guessability of passwords.

2. **Attack models**. The paper defines the password probability space model, the attack model, and relevant metrics for the compression coding model in the context of information theory. The definitions of the three hierarchical attackers and the quantification of the two gaps can serve as a basis for problem formulation in future work in this area.

3. **Validations**. The unification of password guessing and coding theory is supported by concrete theoretical derivations and experimental validations.

4. **Real-world scenarios**. The paper considers real-world password guessing attacks, covering state-of-the-art guessing methods and multiple guessing scenarios: online guessing without targeted information, targeted online guessing, and offline guessing.

Weaknesses
----------
1. The paper should say more about the practical implications and deployment of this work for maintaining and enhancing password security.
2. The writing needs significant improvement.

Detailed comments for authors
-----------------------------
This paper investigates the issue of password guessing using techniques from information theory. It provides valuable insights and its problem modeling and formulation have practical implications for the password research community. Below, I provide a detailed review of various aspects of this paper, along with my comments and suggestions for the authors to consider.

**Evaluation**

The evaluation metrics used in this work for the different password guessing scenarios are reasonable, but they need further explanation. Offline guessing is evaluated with metrics related to the hash cost, while online guessing is evaluated by the number of successful guesses under a fixed number of attempts. I recommend explaining why the metrics differ between these two scenarios.

**Security insights**

1. The authors address the concern of password resistance to guessing attacks in real-world scenarios. However, the paper lacks specific recommendations for improving password security. While Appendix D mentions a method for measuring password guessability, it does not clearly explain the implementation procedure, such as whether the method should be applied on the client side or the server side, or how concerns about potential disclosure of user privacy would be handled.

2. The paper seems to fall short in predicting the guess number of a password, particularly when the guess number is large, which is the task the Monte Carlo method (cited below) addresses. Can this work propose improvements to the Monte Carlo method in the context of multiple guessing methods?

"Monte Carlo Strength Evaluation: Fast and Reliable Password Checking", CCS '15.
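
To make the question concrete, the core estimator of the cited Monte Carlo method can be sketched as below. This is a minimal sketch based on my understanding of that paper, not the authors' code and not part of the submission: given the model probabilities of n passwords sampled from the model itself, the guess number of a password with probability p is estimated as the sum of 1/(n * p_i) over samples with p_i > p. The names `build_estimator` and `estimate` are hypothetical, and how the sample probabilities would be obtained when several guessing methods are combined is exactly the open point raised above.

```python
import bisect


def build_estimator(sample_probs):
    """Build a guess-number estimator from model probabilities of sampled passwords.

    sample_probs: probabilities (under the guessing model) of n passwords
    that were sampled from that same model. Assumed to be strictly positive.
    """
    n = len(sample_probs)
    descending = sorted(sample_probs, reverse=True)

    # cumulative[i] = sum over the (i + 1) most probable samples of 1 / (n * p_j)
    cumulative = []
    total = 0.0
    for p in descending:
        total += 1.0 / (n * p)
        cumulative.append(total)

    ascending = descending[::-1]  # bisect needs ascending order

    def estimate(p):
        """Estimate how many passwords the model ranks strictly above probability p."""
        more_probable = n - bisect.bisect_right(ascending, p)
        return cumulative[more_probable - 1] if more_probable > 0 else 0.0

    return estimate


# Toy illustration with hand-picked sample probabilities (not real data).
estimate = build_estimator([0.5, 0.5, 0.25, 0.125])
rank_of_quarter = estimate(0.25)  # only the two 0.5 samples count
```

Each lookup costs one binary search after a single sort of the sample, which is why the estimate remains cheap for very large guess numbers.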

**Presentation**

The paper's presentation needs improvement. The overall organization is commendable, but the flow is not smooth: the same piece of information is sometimes spread across multiple paragraphs or even multiple sections. I recommend consolidating the evaluation metrics for the different password guessing scenarios into a single table for clarity.

In summary, I recommend a "weak accept" for this paper.